AMR Parsing via Graph-Sequence Iterative Inference
We propose a new end-to-end model that treats AMR parsing as a series of dual
decisions on the input sequence and the incrementally constructed graph. At
each time step, our model performs multiple rounds of attention, reasoning, and
composition that aim to answer two critical questions: (1) which part of the
input \textit{sequence} to abstract; and (2) where in the output \textit{graph}
to construct the new concept. We show that the answers to these two questions
mutually depend on each other. We design a model based on iterative inference
that helps achieve better answers from both perspectives, leading to greatly
improved parsing accuracy. Our model significantly outperforms all previously
reported \textsc{Smatch} scores. Remarkably,
without the help of any large-scale pre-trained language model (e.g., BERT),
our model already surpasses previous state-of-the-art using BERT. With the help
of BERT, we can push the state-of-the-art results to 80.2\% on LDC2017T10 (AMR
2.0) and 75.4\% on LDC2014T12 (AMR 1.0).
Comment: ACL202
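As a toy illustration of this alternating loop, here is a sketch in which an attention summary of the partial graph guides attention over the sequence, and vice versa, for several rounds. The shapes, the dot-product scoring, and the fixed round count are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def iterative_inference(seq_states, graph_states, rounds=3):
    """seq_states: (n_tokens, d); graph_states: (n_nodes, d)."""
    d = seq_states.shape[1]
    # Start from an uninformed (mean) summary of the partial graph.
    graph_summary = graph_states.mean(axis=0)
    for _ in range(rounds):
        # (1) Which part of the *sequence* to abstract, given the graph view.
        seq_attn = softmax(seq_states @ graph_summary / np.sqrt(d))
        seq_summary = seq_attn @ seq_states
        # (2) Where in the *graph* to attach, given the sequence view.
        graph_attn = softmax(graph_states @ seq_summary / np.sqrt(d))
        graph_summary = graph_attn @ graph_states
    return seq_attn, graph_attn

rng = np.random.default_rng(0)
seq_attn, graph_attn = iterative_inference(rng.normal(size=(5, 8)),
                                           rng.normal(size=(3, 8)))
```

Each round sharpens both attention distributions using the other side's latest summary, which is the mutual-dependence intuition in miniature.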
Stereo Matching by Joint Energy Minimization
In [18], Mozerov et al. propose to perform stereo matching as a two-step
energy minimization problem. In the first step they solve a fully connected
MRF model; in the second step, the marginal output is employed as the unary
cost for a locally connected MRF model.
In this paper we intend to combine the two steps of energy minimization in
order to improve stereo matching results. We observe that the fully connected
MRF leads to smoother disparity maps, while the locally connected MRF achieves
superior results in fine-structured regions. Thus we propose to jointly solve
the fully connected and locally connected models, taking both their advantages
into account. The joint model is solved by mean field approximation. Our joint
model outperforms the two-step energy minimization approach in both running
time and estimation error on the Middlebury stereo benchmark v3.
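A minimal sketch of mean field inference for a pairwise Potts-style MRF over disparity labels. The unary costs and the unit coupling weights below are made up for illustration; the paper's joint model combines fully and locally connected terms with its own weights.

```python
import numpy as np

def mean_field(unary, coupling, iters=20):
    """Mean field for a Potts MRF with energy
    E = sum_i unary[i, l_i] + sum_{i<j} coupling[i, j] * [l_i != l_j].
    coupling must be symmetric with a zero diagonal."""
    q = np.exp(-unary)
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # Expected Potts penalty: neighbours' probability mass on OTHER labels.
        energy = unary + coupling @ (1.0 - q)
        q = np.exp(-(energy - energy.min(axis=1, keepdims=True)))
        q /= q.sum(axis=1, keepdims=True)
    return q

# Toy example: 4 pixels, 2 disparity labels, fully connected unit coupling.
unary = np.array([[0.0, 2.0], [0.0, 2.0], [2.0, 0.0], [0.1, 0.0]])
coupling = 1.0 - np.eye(4)
q = mean_field(unary, coupling)
labels = q.argmax(axis=1)
```

The strong smoothness coupling overrides the two outlier unaries, so all pixels agree on one disparity, which is exactly the smoothing behaviour attributed to the fully connected model in the abstract.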
Deep Feature Based Contextual Model for Object Detection
Object detection is one of the most active areas in computer vision, which
has made significant improvement in recent years. Current state-of-the-art
object detection methods mostly adhere to the framework of regions with
convolutional neural network (R-CNN) and only use local appearance features
inside object bounding boxes. Since these approaches ignore the contextual
information around the object proposals, the outcome of these detectors may
generate a semantically incoherent interpretation of the input image. In this
paper, we propose an ensemble object detection system which incorporates the
local appearance, the contextual information in terms of relationships among
objects and the global scene based contextual feature generated by a
convolutional neural network. The system is formulated as a fully connected
conditional random field (CRF) defined on object proposals and the contextual
constraints among object proposals are naturally modeled as edges. Furthermore,
a fast mean field approximation method is used to perform inference in this CRF
model efficiently. The experimental results demonstrate that our approach
achieves a higher mean average precision (mAP) on the PASCAL VOC 2007 dataset
than the baseline Faster R-CNN algorithm.
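A bare-bones sketch of mean field inference in such a fully connected CRF over object proposals. The unary scores (standing in for detector confidence plus a scene score) and the label co-occurrence compatibilities below are invented for illustration; in the paper these potentials are learned.

```python
import numpy as np

def crf_mean_field(scores, compat, iters=10):
    """scores: (n_proposals, n_classes) unary log-potentials;
    compat: (n_classes, n_classes) log-compatibility of labels
    co-occurring in the same image. Fully connected CRF."""
    q = np.exp(scores - scores.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # Message to proposal i: compatibility-weighted beliefs of all j != i.
        msg = (q.sum(axis=0, keepdims=True) - q) @ compat.T
        logits = scores + msg
        q = np.exp(logits - logits.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)
    return q

# 3 proposals, 2 classes; same-class pairs get a co-occurrence bonus.
scores = np.array([[3.0, 0.0], [0.2, 0.0], [0.0, 3.0]])
compat = np.array([[0.5, -0.5], [-0.5, 0.5]])
q = crf_mean_field(scores, compat)
```

The contextual message nudges the ambiguous middle proposal toward labels that are compatible with its confident neighbours, which is the "semantically coherent interpretation" the ensemble aims for.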
Learning Graph-Level Representation for Drug Discovery
Predicting macroscopic influences of drugs on the human body, such as efficacy
and toxicity, is a central problem of small-molecule based drug discovery.
Molecules can be represented as undirected graphs, and we can utilize graph
convolutional networks to predict molecular properties. However, graph
convolutional networks and other graph neural networks all focus on learning
node-level rather than graph-level representations. Previous works simply sum
the feature vectors of all nodes in the graph to obtain the graph feature
vector for drug prediction. In this paper, we introduce a dummy
super node that is connected with all nodes in the graph by a directed edge as
the representation of the graph, and modify the graph operations to help the
dummy super node learn graph-level features. Thus, we can handle graph-level
classification and regression in the same way as node-level classification and
regression. In addition, we apply focal loss to address class imbalance in drug
datasets. The experiments on MoleculeNet show that our method can effectively
improve the performance of molecular property prediction.
Comment: arXiv admin note: text overlap with arXiv:1703.00564,
arXiv:1611.03199 by other author
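The super-node idea can be sketched as follows: append one extra node that receives a directed edge from every real node, run an aggregation step, and read the graph-level feature from the super node's slot. The mean aggregation below is a simple stand-in for the paper's modified graph convolution.

```python
import numpy as np

def add_super_node(adj, feats):
    """Append a dummy super node with a directed edge from every real node,
    so the whole graph can write into one representation slot."""
    n, d = feats.shape
    adj2 = np.zeros((n + 1, n + 1))
    adj2[:n, :n] = adj
    adj2[:n, n] = 1.0                       # node i -> super node
    feats2 = np.vstack([feats, np.zeros(d)])
    return adj2, feats2

def graph_level_feature(adj, feats):
    adj2, feats2 = add_super_node(adj, feats)
    # One mean-aggregation step over in-neighbours.
    in_deg = adj2.sum(axis=0).clip(min=1.0)
    agg = (adj2.T @ feats2) / in_deg[:, None]
    return agg[-1]                          # the super node's row

# Toy molecule: a 3-atom path graph with 2-d node features.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
g = graph_level_feature(adj, feats)
```

With one mean-aggregation step the super node reduces to the plain sum/mean readout the abstract criticizes; the point of the paper's modified operations is that stacked, learned layers let the super node go beyond this baseline.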
The Forgettable-Watcher Model for Video Question Answering
A number of visual question answering approaches have been proposed recently,
aiming at understanding the visual scenes by answering the natural language
questions. While image question answering has drawn significant attention,
video question answering remains largely unexplored.
Video-QA is different from Image-QA since the information and the events are
scattered among multiple frames. In order to better utilize the temporal
structure of the videos and the phrasal structures of the answers, we propose
two mechanisms, re-watching and re-reading, and combine them into the
forgettable-watcher model. We then build a TGIF-QA dataset for video question
answering with the help of automatic question generation. Finally, we evaluate
the models on our dataset. The experimental results show the effectiveness of
our proposed models.
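The re-watching/re-reading idea can be sketched as alternating attention passes: attend over the video frames given the current context ("re-watch"), then over the answer words given the video summary ("re-read"). The shapes and the dot-product scoring are illustrative assumptions, not the paper's model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forgettable_watcher(frames, question, answer_words, passes=2):
    """frames: (n_frames, d); question: (d,); answer_words: (n_words, d)."""
    ctx = question
    for _ in range(passes):
        video = softmax(frames @ ctx) @ frames          # re-watch the frames
        ctx = softmax(answer_words @ video) @ answer_words  # re-read the answer
    return ctx

rng = np.random.default_rng(1)
ctx = forgettable_watcher(rng.normal(size=(6, 4)),   # 6 frames, dim 4
                          rng.normal(size=4),         # question vector
                          rng.normal(size=(3, 4)))    # 3 answer words
```

Multiple passes let information scattered across frames be revisited instead of being compressed once, which is the motivation for re-watching stated above.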
Density Sensitive Hashing
Nearest neighbors search is a fundamental problem in various research fields
like machine learning, data mining and pattern recognition. Recently,
hashing-based approaches, e.g., Locality Sensitive Hashing (LSH), have proved
effective for scalable high-dimensional nearest neighbor search. Many hashing
algorithms have their theoretical roots in random projection. Since these
algorithms generate the hash tables (projections) randomly, a large number of
hash tables (i.e., long codewords) are required in order to achieve both high
precision and recall. To address this limitation, we propose a novel hashing
algorithm called {\em Density Sensitive Hashing} (DSH) in this paper. DSH can
be regarded as an extension of LSH. By exploring the geometric structure of the
data, DSH avoids purely random projection selection and uses the projection
functions that best agree with the distribution of the data.
Extensive experimental results on real-world data sets have shown that the
proposed method achieves better performance compared to the state-of-the-art
hashing approaches.
Comment: 10 pages
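One simplified reading of the data-driven idea: cluster the data, then use perpendicular bisectors of centroid pairs as hash planes, so the planes separate dense regions instead of being drawn at random. This is an illustrative sketch, not the authors' exact algorithm.

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return C

def dsh_codes(X, n_bits, seed=0):
    """Density-aware hashing sketch: each bit thresholds along the line
    between two centroids, with the plane through their midpoint."""
    C = kmeans(X, 2 * n_bits, seed=seed)
    rng = np.random.default_rng(seed + 1)
    bits = []
    for _ in range(n_bits):
        i, j = rng.choice(len(C), 2, replace=False)
        w = C[i] - C[j]                 # plane normal
        t = w @ (C[i] + C[j]) / 2.0     # plane through the midpoint
        bits.append((X @ w > t).astype(int))
    return np.stack(bits, axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
codes = dsh_codes(X, n_bits=4)
```

Because the planes follow the data's cluster structure, fewer bits are wasted on cuts through empty space, which is how such schemes aim to shorten codewords relative to random projection.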
A Revisit on Deep Hashings for Large-scale Content Based Image Retrieval
There is a growing trend in studying deep hashing methods for content-based
image retrieval (CBIR), where hash functions and binary codes are learnt using
deep convolutional neural networks and then the binary codes can be used to do
approximate nearest neighbor (ANN) search. All the existing deep hashing papers
report their methods' superior performance over the traditional hashing methods
according to their experimental results. However, there are serious flaws in
the evaluations of existing deep hashing papers: (1) The datasets they used are
too small and simple to simulate the real CBIR situation. (2) They did not
correctly include the search time in their evaluation criteria, while the
search time is crucial in real CBIR systems. (3) The performance of some
unsupervised hashing algorithms (e.g., LSH) can easily be boosted if one uses
multiple hash tables, an important factor that should be considered in the
evaluation, yet most of the deep hashing papers failed to do so.
We re-evaluate several state-of-the-art deep hashing methods with a carefully
designed experimental setting. Empirical results reveal that the performance of
these deep hashing methods is inferior to that of multi-table IsoH, a very
simple unsupervised hashing method. Thus, the conclusions in all the deep
hashing papers should be carefully re-examined.
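The multi-table boost mentioned in point (3) is easy to see in code: with several independent hash tables, a query collects the union of its buckets, trading extra candidates for higher recall. A minimal random-hyperplane LSH sketch (table and bit counts are arbitrary choices):

```python
import numpy as np
from collections import defaultdict

class MultiTableLSH:
    """Random-hyperplane LSH with several independent tables; querying the
    union of buckets trades extra candidates for higher recall."""
    def __init__(self, dim, n_tables=4, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_tables, n_bits, dim))
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _keys(self, x):
        return [tuple((P @ x > 0).astype(int)) for P in self.planes]

    def add(self, idx, x):
        for table, key in zip(self.tables, self._keys(x)):
            table[key].append(idx)

    def query(self, x):
        candidates = set()
        for table, key in zip(self.tables, self._keys(x)):
            candidates.update(table[key])
        return candidates

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 16))
index = MultiTableLSH(dim=16)
for i, x in enumerate(data):
    index.add(i, x)
cand = index.query(data[7])
```

A fair comparison must therefore charge each method for its candidate-verification time as well, which is the search-time flaw raised in point (2).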
Depth Image Inpainting: Improving Low Rank Matrix Completion with Low Gradient Regularization
We consider the case of inpainting single depth images. Without corresponding
color images, previous or next frames, depth image inpainting is quite
challenging. One natural solution is to regard the image as a matrix and adopt
the low rank regularization just as inpainting color images. However, the low
rank assumption does not make full use of the properties of depth images.
A first observation suggests penalizing the non-zero gradients by sparse
gradient regularization. However, statistics show that though most pixels have
zero gradients, a non-negligible fraction of pixels have gradients equal to 1.
Based on this specific property of depth images, we
propose a low gradient regularization method in which we reduce the penalty for
gradient 1 while still penalizing other non-zero gradients, allowing for
gradual depth changes. The proposed low gradient regularization is integrated
with the low
rank regularization into the low rank low gradient approach for depth image
inpainting. We compare our proposed low gradient regularization with sparse
gradient regularization. The experimental results show the effectiveness of our
proposed approach.
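The contrast between the two penalties can be made concrete on a gently sloping depth row, where every gradient equals 1. The reduced weight `eps` below is an illustrative value, not the paper's.

```python
import numpy as np

def sparse_gradient_penalty(grad):
    # L0-style sparse gradient penalty: every non-zero gradient pays cost 1.
    return float(np.count_nonzero(grad))

def low_gradient_penalty(grad, eps=0.1):
    """Reduced cost for |gradient| == 1 so gradual depth changes stay cheap;
    eps is an illustrative weight, not the paper's value."""
    g = np.abs(grad)
    return float(np.where(g == 0, 0.0, np.where(g == 1, eps, 1.0)).sum())

# A gently sloping depth row: unit steps everywhere.
ramp = np.arange(10.0)
g = np.diff(ramp)          # all gradients equal 1
```

The sparse penalty charges this smooth ramp as heavily as 9 arbitrary jumps, while the low gradient penalty charges only a fraction of that, so inpainting is no longer biased toward fronto-parallel (piecewise-constant) surfaces.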
A parallel space-time domain decomposition method for unsteady source inversion problems
In this paper, we propose a parallel space-time domain decomposition method
for solving an unsteady source identification problem governed by the linear
convection-diffusion equation. Traditional approaches require repeatedly
solving a forward parabolic system, an adjoint system, and a system with
respect to the unknowns, one after another. These sequential steps are not
desirable for large-scale parallel computing. A
space-time restrictive additive Schwarz method is proposed for a fully implicit
space-time coupled discretization scheme to recover the time-dependent
pollutant source intensity functions. We show with numerical experiments that
the scheme works well with noise in the observation data. More importantly, it
is demonstrated that the parallel space-time Schwarz preconditioner is scalable
on a supercomputer with over processors, and is thus promising for large-scale
applications.
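The restrictive additive Schwarz idea can be sketched on a toy problem: solve the residual equation exactly on each overlapping subdomain, but keep each local correction only on that subdomain's non-overlapping interior. The 1-D Laplacian below merely stands in for the fully implicit space-time system; sizes and overlaps are illustrative.

```python
import numpy as np

def ras_step(A, x, b, subdoms, interiors):
    """One restricted additive Schwarz (RAS) iteration."""
    r = b - A @ x
    z = np.zeros_like(x)
    for ov, keep in zip(subdoms, interiors):
        # Exact solve on the overlapping subdomain.
        local = np.linalg.solve(A[np.ix_(ov, ov)], r[ov])
        # Keep the correction only on the owned (non-overlapping) part.
        for pos, g in enumerate(ov):
            if g in keep:
                z[g] = local[pos]
    return x + z

# Toy 1-D Laplacian standing in for one space-time coupled system.
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
subdoms = [list(range(0, 12)), list(range(8, 20))]    # overlap of 4 points
interiors = [set(range(0, 10)), set(range(10, 20))]   # disjoint ownership
x = np.zeros(n)
for _ in range(100):
    x = ras_step(A, x, b, subdoms, interiors)
```

Since the subdomain solves are independent, each iteration parallelizes naturally across processors, which is what makes the space-time Schwarz approach attractive at scale.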
Two-level space-time domain decomposition methods for unsteady inverse problems
As the number of processor cores on supercomputers grows larger and larger,
algorithms with a high degree of parallelism attract more attention. In this
work, we propose a novel space-time coupled algorithm for solving an inverse
problem associated with the time-dependent convection-diffusion equation in
three dimensions. We introduce a mixed finite element/finite difference method
and a one-level and a two-level space-time parallel domain decomposition
preconditioner for the Karush-Kuhn-Tucker (KKT) system induced from
reformulating the inverse problem as an output least-squares optimization
problem in the space-time domain. The new full space approach eliminates the
sequential steps of the optimization outer loop and the inner forward and
backward time marching processes, thus achieving a high degree of parallelism.
Numerical experiments validate that this approach is effective and robust for
recovering unsteady moving sources. We report strong scalability results
obtained on a supercomputer with more than 1,000 processors.
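The full-space idea can be illustrated on a toy output least-squares problem: instead of alternating forward solves, adjoint solves, and source updates, assemble the KKT system of min 0.5||u - d||^2 + 0.5*beta*||q||^2 subject to A u = q, and solve for state u, source q, and adjoint lam in one coupled solve. The 1-D Laplacian A, the sizes, and beta are illustrative assumptions, not the paper's discretization.

```python
import numpy as np

n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # toy forward operator
beta = 1e-6                                           # Tikhonov weight

q_true = np.sin(np.linspace(0, np.pi, n))   # hidden source to recover
d = np.linalg.solve(A, q_true)              # noise-free state observations

I = np.eye(n)
Z = np.zeros((n, n))
# KKT conditions: u - d + A^T lam = 0;  beta q - lam = 0;  A u - q = 0.
KKT = np.block([[I,        Z,  A.T],
                [Z, beta * I,   -I],
                [A,       -I,    Z]])
rhs = np.concatenate([d, np.zeros(n), np.zeros(n)])
sol = np.linalg.solve(KKT, rhs)
u, q, lam = sol[:n], sol[n:2 * n], sol[2 * n:]
```

The single coupled solve replaces the sequential outer-loop/forward/adjoint cycle, and it is this one large system that the one- and two-level space-time preconditioners are designed to handle in parallel.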